SUMMARY

Colonization of the fetal and infant gut microbiome results in dynamic
changes in diversity, which can impact disease susceptibility. To examine
the relationship between human gut microbiome dynamics throughout infancy
and type 1 diabetes (T1D), we examined a cohort of 33 infants genetically
predisposed to T1D. Modeling trajectories of microbial abundances through
infancy revealed a subset of microbial relationships shared across most
subjects. Although strain composition of a given species was highly
variable between individuals, it was stable within individuals throughout
infancy. Metabolic composition and metabolic pathway abundance remained
constant across time. A marked drop in alpha-diversity was observed in T1D
progressors in the time window between seroconversion and T1D diagnosis,
accompanied by spikes in inflammation-favoring organisms, gene functions,
and serum and stool metabolites. This work identifies trends in the
development of the human infant gut microbiome along with specific
alterations that precede T1D onset and distinguish T1D progressors from
nonprogressors.

INTRODUCTION 

The initial colonization of the human gut microbiota begins in utero
(Aagaard et al., 2014) and is strongly influenced by microbial exposures
at birth (Dominguez-Bello et al., 2010). The initial seeding and
development of this community may have long-term physiological
consequences. Low-resolution longitudinal studies in 14 infants (Palmer
et al., 2007) and higher-resolution studies in a single infant (Koenig et
al., 2011) have documented the gradual increase in phylogenetic diversity,
nonrandom community assembly, the effects of introducing table foods, and
the large taxonomic shifts that can occur during infancy. High-resolution
multi'omic studies that examine the dynamics of infant gut microbiome
development in a large, longitudinal cohort have been lacking, though one
recent study has shown that children with severe acute malnutrition
exhibit decreased "microbiota maturity" using such a cohort (Subramanian
et al., 2014). Events in early microbiome development may have a role in
promoting susceptibility to or protection from disease later in life; this
has been demonstrated in mice (Cho et al., 2012; Cox et al., 2014), and it
may also be true for type 1 diabetes (T1D) (Brown et al., 2011; Giongo et
al., 2011; de Goffau et al., 2013).

Figure 1. A Cohort to Assess the Dynamics of the Developing Human Gut Microbiota in Infancy

Individuals are represented in rows, and each point is a stool sample. The size of the points represents the number of serum autoantibodies (0-5) that were positive at the time of the sample collection. See also Figure S1.

T1D is an autoimmune disorder that results from T cell-mediated
destruction of the insulin-producing β cells of the pancreatic islets.
Although approximately 70% of T1D cases carry predisposing HLA risk
alleles, only 3%-7% of children with those alleles develop T1D (Achenbach
et al., 2005), suggesting a significant nongenetic component to the
disease. The incidence of T1D has been increasing rapidly over the past
few decades, particularly in the youngest age groups (0-4 years)
(Harjutsalo et al., 2008), which further points to nongenetic factors.
The incidence of T1D is particularly high in Finland, where 1 in 120
children develops T1D before 15 years of age (Knip et al., 2005).

Although there have been limited human studies of the microbiome in T1D
to date, the notion that T1D pathogenesis may be influenced by microbial
exposures has been well established in murine models. The knockout of
MyD88, an adaptor downstream of multiple Toll-like receptors involved in
microbial sensing, in the NOD mouse results in complete protection from
diabetes (Wen et al., 2008). Further, heterozygous MyD88(KO/+) NOD mice,
which normally develop robust disease, are protected from diabetes when
exposed from birth to the gut microbiota of a MyD88-KO NOD donor (Wen et
al., 2008). Therefore, disease progression in the NOD mouse is driven in
part by an exaggerated innate immune response to symbiotic microbiota,
and altering the composition of the microbiota can curtail this response
and prevent disease. Prospective studies are required to assess whether
the microbiota is similarly involved in human T1D progression; however,
such cohorts are exceedingly difficult to build (Brown et al., 2011;
Giongo et al., 2011).

Here, we assess the composition of the gut microbiota in a densely
sampled, prospective, longitudinal cohort of 33 HLA-matched infants
followed from birth until 3 years of age. We use this unprecedented
sample resolution to describe the dynamics and stability of the
developing microbiome in the infant gut of an at-risk T1D cohort. We show
that although there are significant shifts in taxonomic composition over
time, the relative abundance of metabolic pathways within individuals
remains remarkably constant throughout infancy. We identify a 25% drop in
alpha-diversity in children who progress to T1D compared to controls,
which occurs after seroconversion but before disease diagnosis, and
identify alterations to both the phylogenetic and metabolic pathway
composition of the microbiome during this time that are characteristic of
a proinflammatory environment. Our results demonstrate significant
alterations to the gut microbiome in T1D progressors prior to disease
onset.

RESULTS 

Extensive  Characterization  of  the  Infant  Gut  Microbiota 
in  a  Longitudinal  Cohort 

To characterize the development of the infant gut microbiome and the
relationship between the gut microbiota and islet autoimmunity and
progression to T1D, we assembled a prospective, longitudinal collection
of stool samples from infants at risk for disease (Figure 1). Infants
from Finland and Estonia were recruited at birth based on HLA risk
genotyping (Table 1 and see Table S1 available online). Parents collected
their infants' stool at approximately monthly intervals. The cohort
comprised 33 infants, 11 of whom seroconverted to serum autoantibody
positivity (referred to hereafter as "seroconverters"; defined as being
positive for at least two of the five autoantibodies analyzed in this
study; see Experimental Procedures), and of those, four developed T1D
within the time frame of this study (referred to as "T1D cases"; see
Table 1 and Figure 1). The 11 seroconverters were matched with the 22
controls for gender, HLA genotype, and country.

Table 1. Summary of Study Cohort

Subject ID | Country of Origin | T1D Status | HLA Type | Autoantibody Positive | Age at Seroconversion (Days) | Age at T1D Diagnosis (Days)
--- | --- | --- | --- | --- | --- | ---
T026177 | Estonia | nonconverter | DQB1*0302/*0501-DRB1*0401 | N/A | N/A | N/A
T025411 | Estonia | nonconverter | DQB1*0302/*0501-DRB1*0404 | N/A | N/A | N/A
T014292 | Estonia | nonconverter | DQA1*05/*03-DQB1*02/*0301 | N/A | N/A | N/A
T012808 | Estonia | nonconverter | DQA1*05/*0201-DQB1*02/*02 | N/A | N/A | N/A
E029817 | Finland | nonconverter | DQB1*0302/*04-DRB1*0404 | N/A | N/A | N/A
E026325 | Finland | nonconverter | DQB1*0302/*0501-DRB1*0404 | N/A | N/A | N/A
E022852 | Finland | nonconverter | DQB1*0302/*0501-DRB1*0404 | N/A | N/A | N/A
E021406 | Finland | nonconverter | DQB1*0302/*0302-DRB1*0401/*0401 | N/A | N/A | N/A
E018268 | Finland | nonconverter | DQB1*0302/*0604-DRB1*0404 | N/A | N/A | N/A
E017833 | Finland | nonconverter | DQA1*05-DQB1*02/*04 | N/A | N/A | N/A
E017824 | Finland | nonconverter | DQB1*0302/*0604-DRB1*0404 | N/A | N/A | N/A
E016924 | Finland | nonconverter | DQA1*05-DQB1*02/*04 | N/A | N/A | N/A
E013487 | Finland | nonconverter | DQA1*05/*03-DQB1*02/*0302-DRB1*0401 | N/A | N/A | N/A
E011279 | Finland | nonconverter | DQB1*0302/*04-DRB1*0404 | N/A | N/A | N/A
E010590 | Finland | nonconverter | DQB1*0302/*04-DRB1*0401 | N/A | N/A | N/A
E006673 | Finland | nonconverter | DQA1*05/*03-DQB1*02/*0302-DRB1*0401 | N/A | N/A | N/A
E006646 | Finland | nonconverter | DQB1*0302/*0501-DRB1*0401 | N/A | N/A | N/A
E006547 | Finland | nonconverter | DQB1*0302/*0501-DRB1*0404 | N/A | N/A | N/A
E004016 | Finland | nonconverter | DQB1*0302/*0502-DRB1*0404 | N/A | N/A | N/A
E003872 | Finland | nonconverter | DQB1*0302/*0501-DRB1*0404 | N/A | N/A | N/A
E003061 | Finland | nonconverter | DQB1*0302/*0502-DRB1*0405 | N/A | N/A | N/A
E001463 | Finland | nonconverter | DQB1*0302/*04-DRB1*0401 | N/A | N/A | N/A
T013815 | Estonia | seroconverter | DQA1*05/*0201-DQB1*02/*02 | IAA, GADA | 350.4 | N/A
E026079 | Finland | seroconverter | DQB1*0302/*04-DRB1*0401 | IAA, GADA | 580.35 | N/A
E022137 | Finland | seroconverter | DQB1*0302/*0501-DRB1*0401 | IAA, GADA, IA-2A, ZNT8A, ICA | 562.1 | N/A
E018113 | Finland | seroconverter | DQB1*0302/*04-DRB1*0401 | IAA, GADA, IA-2A, ZNT8A, ICA | 587.65 | N/A
E017751 | Finland | seroconverter | DQA1*05-DQB1*02/*0604 | IAA, ICA | 175.2 | N/A
E010629 | Finland | seroconverter | DQB1*0302/*0501-DRB1*0401 | IAA, GADA, ZNT8A, ICA | 945.35 | N/A
E003989 | Finland | seroconverter | DQB1*0302/*04-DRB1*0401 | IAA, GADA, ZNT8A, ICA | 346.75 | N/A
T025418 | Estonia | T1D case | DQA1*0201/*03-DQB1*02/*0302-DRB1*0404 | IAA, GADA, IA-2A, ICA | 540.2 | 879.65
E010937 | Finland | T1D case | DQA1*05/*03-DQB1*02/*0302-DRB1*0401 | IAA, IA-2A, ZNT8A, ICA | 905.2 | 959.95
E006574 | Finland | T1D case | DQB1*0302/*0501-DRB1*0401 | IAA, GADA, IA-2A, ZNT8A, ICA | 532.9 | 1,339.55
E003251 | Finland | T1D case | DQB1*0302/*0501-DRB1*0401 | IAA, GADA, IA-2A, ZNT8A, ICA | 357.7 | 1,168

See also Table S1 and Table S2.

Sequencing of the V4 region of the 16S rDNA gene was carried out on a
total of 989 samples using paired-end, partially overlapping reads on the
Illumina MiSeq V2 platform as previously described (Caporaso et al.,
2012), yielding a very high depth of sequencing with a mean of 65,076
reads per sample. Taxonomic profiling was performed using QIIME (Caporaso
et al., 2010), and functional profiling of microbial pathways was inferred
from 16S sequences with PICRUSt (Langille et al., 2013). In total, there
were 777 unique samples sequenced by 16S, with a median of 23 unique
samples per individual (minimum 8, maximum 34); the full OTU table is
available in Table S2. Shotgun metagenomic sequencing was performed on a
subset of 124 samples from 19 individuals, including all 11
seroconverters, with a median of 6 samples per individual (minimum 3,
maximum 11) (Figure S1). The median depth of sequencing was 2.5 Gb per
sample. Phylogenetic community profiling of metagenomic data was
performed using MetaPhlAn (Segata et al., 2012), and functional profiling
of microbial pathways was characterized with HUMAnN (Abubucker et al.,
2012).

In addition to 16S and metagenomic sequencing, serum and stool
metabolomics were performed on the cohort. For each of the 33
participants, 7 serum samples taken throughout the experimental time
frame (Table S1) were subjected to metabolomics and lipidomics, and all
samples that were used for shotgun metagenomics were also analyzed by
stool metabolomics and lipidomics (see Experimental Procedures).

Gut  Microbial  Metabolites  and  Functional  Pathways,  but 
Not  Taxonomies,  Are  Stable  throughout  Infant 
Development 

Principal coordinates analysis of the Bray-Curtis dissimilarity between
all 777 16S-sequenced samples revealed that age is the strongest driver
of the composition of the infant gut microbiome (Figure 2A). Age
accounted for 18% of the variation between samples, and showed a nearly
linear gradient diagonally along the first and second principal
coordinates. Similarly, the Chao1 alpha-diversity, a measure of the
number of distinct microbes in a community, exponentially increased in
early development until reaching a maximum at 3 years of age
(Figure 2B).
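
The two quantities named in this paragraph can be illustrated with a minimal sketch. It assumes the standard textbook definitions (Bray-Curtis dissimilarity on abundance vectors, and the bias-corrected form of the Chao1 estimator); the study itself computed these with QIIME, and the OTU counts below are hypothetical values chosen only for illustration.

```python
from collections import Counter


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("abundance vectors must have the same length")
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def chao1(counts):
    """Bias-corrected Chao1 richness estimate from per-OTU read counts."""
    observed = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    singletons, doubletons = freq[1], freq[2]
    # Rare OTUs (seen once or twice) are used to estimate unseen richness.
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Hypothetical OTU counts for an early and a late stool sample.
early = [10, 1, 1, 2, 0, 0]
late = [6, 0, 3, 2, 1, 4]

print("Bray-Curtis:", bray_curtis(early, late))
print("Chao1 (early):", chao1(early))
print("Chao1 (late):", chao1(late))
```

A dissimilarity of 0 means identical composition and 1 means no shared taxa; Chao1 is never smaller than the number of observed OTUs.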

We hypothesized that with increasing taxonomic diversity in the
developing gut comes an equivalent change in the metabolic composition of
the gut community; however, this was not the case. The stool metabolomics
beta-diversity distances between samples did not show as strong an age
trend as did taxonomies (Figure 2C), and the alpha-diversity of stool
metabolites was nearly flat across time, with the exception of a few
outlier samples from very early time points that had a much lower
diversity (Figure 2D). More strikingly, the relative abundance of
metabolic modules in the microbiome remained approximately constant
throughout time and across individuals (Figure 2E shows all samples
sorted by age across all individuals). Essentially all pathways are
encoded by all individuals from the earliest to the latest time points.
The evenness is higher in the first few months until it stabilizes
(Figure 2F). This may be because the composition of the microbiome
requires time to "settle" into its optimal abundance of functional
pathways, which is less evenly distributed than in the earliest time
points.
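
As a rough illustration of what an evenness measure captures, the sketch below computes Shannon diversity and Pielou's evenness on relative abundances of pathway modules. The text does not state which evenness measure was used for Figure 2F, so Pielou's index is an assumption here, and the module abundances are hypothetical.

```python
import math


def shannon(abundances):
    """Shannon diversity (natural log) of a vector of non-negative abundances."""
    total = sum(abundances)
    props = [x / total for x in abundances if x > 0]
    return -sum(p * math.log(p) for p in props)


def pielou_evenness(abundances):
    """Pielou's evenness: Shannon diversity divided by its maximum, ln(richness)."""
    richness = sum(1 for x in abundances if x > 0)
    if richness < 2:
        return 0.0
    return shannon(abundances) / math.log(richness)


# Hypothetical module abundances: a flat early profile and a skewed later one.
early_modules = [0.25, 0.25, 0.25, 0.25]
later_modules = [0.55, 0.25, 0.15, 0.05]

print("early evenness:", pielou_evenness(early_modules))
print("later evenness:", pielou_evenness(later_modules))
```

An evenness of 1 means all modules are equally abundant; lower values indicate that a few modules dominate.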

These results demonstrate a remarkable stability in the metabolic pathway
coding potential, and the metabolic content, of the microbiome despite
dramatic shifts of taxonomic composition throughout human infancy.

A  Model  of  the  Dynamics  of  the  Developing  Gut 
Microbiome 

The strength of the age effect in taxonomies and its consistency across
individuals suggested that there may be closely shared phylogenetic
trajectories that define the development of the gut microbiome, in
agreement with previous cross-sectional studies in older children
(Yatsunenko et al., 2012). To investigate the driving forces behind this
effect, we performed pairwise correlations of the trajectories of
abundance between all clades across time on a per-individual basis,
excluding T1D cases. Correlations were determined using CCREPE (Faust et
al., 2012), a tool designed to find significant correlations in sparse,
compositional data such as 16S sequencing data, which are prone to
spurious correlations. The Z score for each clade-clade pair was summed
across all individuals (excluding T1D cases), revealing a small set of
pairs with very strong correlations that were consistent across most
subjects. This allowed us to produce a network of the dynamics of the
developing gut microbiome (Figure 3). Plotting the corresponding clade
abundances over time demonstrates a strongly shared dynamic relationship
across time, and across nearly all individuals, for many specific clades
in this developmental process. The resulting network at the family level
is shown in Figure 3; see Figure S2 for other phylogenetic levels.
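
The idea of summing a per-individual score across subjects can be sketched as follows. This is a simplified stand-in and not CCREPE, which relies on permutation and bootstrap procedures suited to compositional data. Here each individual's clade pair gets a Spearman correlation, converted to an approximate z score with the Fisher transform, and the z scores are added up. The subject names and abundance trajectories are hypothetical.

```python
import math


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def z_score(rho, n):
    """Approximate z score for a correlation via the Fisher transform."""
    rho = max(min(rho, 0.999999), -0.999999)  # keep atanh finite
    return math.atanh(rho) * math.sqrt(n - 3)


def cumulative_z(trajectories, clade_a, clade_b):
    """Sum per-individual z scores for one clade pair across individuals."""
    total = 0.0
    for series in trajectories.values():
        a, b = series[clade_a], series[clade_b]
        if len(a) < 4:  # the Fisher approximation needs n > 3
            continue
        total += z_score(spearman(a, b), len(a))
    return total


# Hypothetical relative abundances over five time points in two subjects.
trajectories = {
    "subject_1": {
        "Bifidobacteriaceae": [0.60, 0.50, 0.35, 0.20, 0.10],
        "Lachnospiraceae": [0.02, 0.10, 0.20, 0.30, 0.45],
    },
    "subject_2": {
        "Bifidobacteriaceae": [0.70, 0.55, 0.40, 0.30, 0.15],
        "Lachnospiraceae": [0.01, 0.05, 0.15, 0.35, 0.40],
    },
}

total = cumulative_z(trajectories, "Bifidobacteriaceae", "Lachnospiraceae")
print("cumulative z:", total)
```

A pair whose relationship has the same sign in most individuals accumulates a large absolute score, while pairs with inconsistent signs tend to cancel out.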

The  Infant  Gut  Microbiome  Remains  Stable  at  the 
Strain  Level 

Having investigated the dynamics of the infant gut microbiome, we
examined its strain-level stability, i.e., the retention of microbial
strains across time. Using the shotgun metagenomics data available on 124
samples, we analyzed strain-level markers on a per-species and
per-individual basis using MetaPhlAn (Segata et al., 2012). Analysis was
restricted to species that have a mean of at least 1x coverage across all
time points per individual; 21 species and 12 individuals met this
requirement. Using an unweighted discordant marker distance metric (see
Supplemental Experimental Procedures), we found that samples were more
similar in the intraindividual versus interindividual comparison
(p < 1e-5; Figure 4A), suggesting that the strain profile for a given
species is more similar between samples in a single individual than
between samples in two people. Surprisingly, the strain profile remained
essentially constant over time for almost all species and in almost all
individuals (Figure 4B shows a representative example in which the marker
abundance is constant over time; individuals #2 and #3). In a rare case
we observed a shift in the strain signature at a specific time point
(individual #1). Table S3 shows marker abundances for all individuals
with sufficient marker coverage.

We investigated community stability by calculating the Jaccard index,
which is defined as the fraction of shared operational taxonomic units
(OTUs), between all pairs of samples within an individual in specified
time windows (Figure 4C). For instance, we calculated the fraction of
shared OTUs within a subject between two samples collected approximately
3 months apart, and found that the Jaccard index is significantly higher
than for two samples from the same individual collected 6 months apart.
As has been observed in an adult population (Faith et al., 2013), we
found that the Jaccard index followed a power-law function (Figure 4C,
line). The curve reached an asymptote at a value of approximately 0.1,
suggesting that about 10% of bacterial strains (observed here at an
OTU-level resolution) were maintained in the infant gut from birth until
3 years of age (Figure 4C). This surprising result demonstrates that
although there is tremendous variability in the gut microbiome through
infancy, the community at 3 years of age retained a nonnegligible
fraction of members that it acquired just after birth.
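
A minimal sketch of the two steps described above: the Jaccard index between the OTU sets of two samples, and a power-law fit to the index as a function of the time between samples. The exact fitting procedure is not given in the text, so the log-log least-squares fit of y = a * x**b is an assumption, and the OTU names and Jaccard medians are hypothetical.

```python
import math


def jaccard(otus_a, otus_b):
    """Fraction of shared OTUs: size of the intersection over size of the union."""
    a, b = set(otus_a), set(otus_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def fit_power_law(x, y):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    slope = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    slope /= sum((u - mx) ** 2 for u in lx)
    return math.exp(my - slope * mx), slope


# Hypothetical OTU sets from one infant, sampled a few months apart.
month_3 = {"otu_1", "otu_2", "otu_3"}
month_6 = {"otu_2", "otu_3", "otu_4"}
print("Jaccard:", jaccard(month_3, month_6))

# Hypothetical median Jaccard index per time window (months between samples).
windows = [1.5, 3.0, 6.0, 12.0, 24.0]
medians = [0.40, 0.30, 0.22, 0.16, 0.12]
a, b = fit_power_law(windows, medians)
print("fit: y = %.3f * x ** %.3f" % (a, b))
```

A negative exponent reflects the decay in shared OTUs as the interval between samples grows.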
Figure 2. Gut Microbial Taxonomies Shift Dramatically, whereas Microbial Metabolites and Metabolic Pathways Remain Relatively Stable throughout Infant Development

(A) Principal coordinates analysis on the unweighted UniFrac distances between samples based on 16S sequencing. Samples are colored by age at stool collection.

(B) Alpha-diversity using the QIIME "observed species" metric on 16S sequencing.

(C) Principal coordinates analysis on stool metabolomics data.

(D) Shannon's diversity measured on stool metabolomics data.

(E) Bars indicate relative abundances of KEGG metabolic modules: A, aminoacyl tRNA; B, arginine and proline metabolism; C, aromatic amino acid metabolism; D, branched-chain amino acid metabolism; E, carbon fixation; F, central carbohydrate metabolism; G, cofactor and vitamin biosynthesis; H, cysteine and methionine metabolism; I, fatty acid metabolism; J, glycosaminoglycan metabolism; K, histidine metabolism; L, lipid metabolism; M, lipopolysaccharide metabolism; N, lysine metabolism; O, methane metabolism; P, nitrogen metabolism; Q, nucleotide sugar; R, other amino acid metabolism; S, other carbohydrate metabolism; T, polyamine biosynthesis; U, purine metabolism; V, pyrimidine metabolism; W, serine and threonine metabolism; X, sulfur metabolism; and Y, terpenoid backbone biosynthesis.

(F) A measure of evenness of KEGG metabolic modules.

Figure 3. Temporal Dynamics of Microbial Taxonomies in Infant Gut Development

Family-level network diagram of the correlation between clades in their trajectories across time, excluding individuals with T1D. Positive correlations are in blue, negative correlations are in red, and the line thickness is proportional to the strength of the correlation (cumulative CCREPE Z statistic). The plots show the abundance of the indicated family as a smoothing spline across all healthy individuals with a 95% confidence interval (shaded region). See also Figure S2.

Figure 4. Bacterial Strains Are Stably Maintained in the Infant Gut throughout Development

(A) Distance between samples between subjects (interindividual) and within subjects (intraindividual) based on MetaPhlAn clade-specific strain marker analysis.

(legend continued below)

Correlations between the Gut Microbiome and Diet and
Environmental Factors

Extensive metadata relating to both clinical and nonclinical factors were
collected for each participant in the study (Table S1), allowing us to
assess the association between the gut microbiome and environmental
factors in our cohort. To avoid the potential confounding effects of age,
multiple sampling from the same individual, and each of the other
metadata, all analyses were performed on a reduced set of samples in a
limited time frame, using age and other metadata as fixed effects and
subject identity as a random effect. This analysis was performed using
multivariate association with linear models (MaAsLin) (Morgan et al.,
2012), an additive general linear model with boosting that can capture
the effects of a parameter of interest while deconfounding the effects of
other metadata. This is particularly important in the current study, as
age, diet, and other factors are expected to have strong influences on
community composition (Table S1; see Experimental Procedures for the
metadata included in the MaAsLin analysis). With MaAsLin, we focused our
analysis on a single variable of interest, and systematically "subtracted
out" the effect of each of the other potentially confounding metadata
variables. A series of five samples from each breastfed subject taken
during and after cessation of breastfeeding revealed an increase in
Bifidobacterium and Lactobacillus species during breastfeeding; however,
we found that the reduction in Lachnospiraceae was an even stronger
effect (Figure S3A). We observed substantial differences between Estonian
and Finnish infants, including significantly higher levels of Bacteroides
and Streptococcus species, which contain a number of potential
pathobionts, in the Estonians (Figure S3B). Additionally, we observed
specific shifts in phylogenetic abundance with several other dietary
parameters: eggs, barley, soy, and fish (nonsignificant) (Figure S3C).
Notably, these shifts are less significant than those associated with
geography or breastfeeding. Although we included antibiotic usage as a
fixed effect for all MaAsLin analyses in this study, we did not have
sufficient annotation on the timing of antibiotic usage to identify
community shifts associated with antibiotics. We did not find differences
in community composition between cesarean-delivered and vaginally
delivered infants, perhaps because our cohort included only three
cesarean-delivered subjects.
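
The notion of "subtracting out" a confounder can be illustrated with a plain multiple regression, in which the coefficient of the variable of interest is estimated while age is held fixed. This is only a conceptual sketch under strong simplifications: it is ordinary least squares with fixed effects only, with no random effect for subject and no boosting, and it does not reproduce MaAsLin. The ages, breastfeeding flags, and abundances are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (aug[r][n] - tail) / aug[r][r]
    return coeffs


def ols(rows, y):
    """Ordinary least squares via the normal equations; an intercept is added."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * v for d, v in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical samples: age in days, breastfeeding flag, and a taxon abundance
# constructed as 0.1 + 0.002 * age + 0.05 * breastfed.
ages = [100, 200, 300, 400, 150, 350]
breastfed = [1, 1, 0, 0, 0, 1]
abundance = [0.1 + 0.002 * a + 0.05 * b for a, b in zip(ages, breastfed)]

intercept, age_effect, breastfeeding_effect = ols(list(zip(ages, breastfed)), abundance)
print("age effect per day:", age_effect)
print("breastfeeding effect, adjusted for age:", breastfeeding_effect)
```

Because age is included as a covariate, the breastfeeding coefficient reflects the difference between breastfed and non-breastfed samples of the same age.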

Gut  Microbiota  Composition  Distinguishes  T1D  Status 

We next examined whether there were features of the microbial community
that could distinguish T1D disease state. We assessed Chao1
alpha-diversity across time in nonconverter (not seroconverted),
seroconverted (not diagnosed with T1D), and T1D cases (seroconverted
subjects also diagnosed with T1D). We observed a pronounced flattening of
the alpha-diversity in T1D subjects at a time when the gut communities of
the nonconverter and seroconverted individuals continued to rise in
alpha-diversity (Figure 5A). This result was significant by permutation
test on subject labels (p < 0.025; Figure S3D). Intriguingly, this
divergence in alpha-diversity occurred after the time period in which
most subjects seroconverted, but before the progressors presented with
clinical disease.
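
A permutation test on subject labels can be sketched as below. The test statistic used in the study is not stated here, so the absolute difference in group means of a per-subject alpha-diversity summary is an assumption, and the diversity values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(case_values, control_values, n_perm=2000, seed=0):
    """Two-sided permutation p value for a difference in group means.

    Subject labels are shuffled while group sizes are preserved.
    """
    rng = random.Random(seed)
    observed = abs(mean(case_values) - mean(control_values))
    pooled = list(case_values) + list(control_values)
    k = len(case_values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:k]) - mean(pooled[k:]))
        if diff >= observed - 1e-12:  # tolerance for floating-point ties
            hits += 1
    # Add one to numerator and denominator so the p value is never zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical per-subject alpha-diversity summaries after seroconversion.
t1d_cases = [10.0, 11.0, 9.0, 10.0]
others = [20.0, 21.0, 19.0, 22.0, 20.0, 18.0, 21.0, 19.0, 20.0, 22.0, 18.0, 21.0]

p_value = permutation_test(t1d_cases, others)
print("permutation p value:", p_value)
```

Shuffling labels at the subject level, rather than the sample level, respects the fact that repeated samples from one infant are not independent.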

To investigate the specific changes to the community that accounted for
the decreased alpha-diversity in T1D subjects, we used MaAsLin analysis
to focus on the time of the alpha-diversity divergence, after 600 days of
age, and observed a number of significant alterations that distinguish
T1D cases from nonconverters and seroconverters (Figure 5B). After
correcting for potential confounding variables, we found that the drop in
alpha-diversity in T1D cases could be accounted for by the relative
overabundance of a few groups: Blautia, the Rikenellaceae, and the
Ruminococcus and Streptococcus genera (not statistically significant).
These groups of bacteria contain species that have been characterized as
"pathobionts" (Chow and Mazmanian, 2009), which are members of the
commensal microbiota that have the capacity to behave as pathogens. Our
shotgun metagenomic sequencing revealed that specific pathobiont-like
species within these groups, such as Ruminococcus gnavus and
Streptococcus infantarius, showed a spike in relative abundance within
the T1D cases at the time of alpha-diversity divergence (Figure 5C).
Conversely, we saw a relative underabundance of a few groups of bacteria
that are commonly depleted in the inflammatory state, namely the
Lachnospiraceae and Veillonellaceae (not statistically significant)
(Figure 5B), and metagenomic sequencing showed the complete absence of a
number of these species, such as Coprococcus eutactus and Dialister
invisus, in T1D cases (Figure 5C). Remarkably, seroconverters had an
intermediate abundance of all of these groups of organisms between
nonconverters and T1D cases (Figure 5B), providing further evidence that
this shift in microbiome composition is linked to the T1D disease state.

Gut  Microbial  Gene  Content  Is  Altered  Prior  to  Clinical 
Onset  of  T1D 

The NIH Human Microbiome Project has shown significant stability in
microbial metabolic pathways across individuals despite high variability
in taxonomic composition (Human Microbiome Project Consortium, 2012). We
investigated whether changes in the abundance of specific metabolic
pathways correlated with T1D status. After correcting for the effects of
confounding variables such as age and diet using MaAsLin, we found
significant shifts that occurred within T1D cases including an increase
in the multiple sugar transport system, which is involved in the
utilization of D-galactose, D-xylose, L-arabinose, D-glucose, and
D-mannose, and a decrease in the biosynthesis of a number of amino acids
(Figure 6A). A shift in functional potential from the synthesis of
nutrients to the passive uptake of nutrients is characteristic of
auxotrophic organisms. Auxotrophs thrive in inflammatory environments
where dead tissue provides easy access to many nutrients that are less
available in the healthy gut (Morgan et al., 2012). As was found for
T1D-associated phylogenies, seroconverters had an intermediate level of
abundance between nonconverters and T1D cases in metabolic pathway
carriage (Figure 6A), and were more similar to nonconverters than T1D
cases.

Serum  and  Gut  Lipids  and  Metabolites  Relevant  to 
Disease  Are  Correlated  with  T1  D-Associated  Microbial 
Taxa 

The  physiological  effects  of  the  gut  microbiota  extend  beyond 
the  gut;  there  is  an  interplay  between  both  host  and  microbial 
enzymes  and  their  metabolites  which  impacts  host  metabolism 
(Velagapudi  et  al.,  2010)  and  mucosal  immunity  (Smith  et  al., 
2013),  as  well  as  diseases  including  cardiovascular  disease 
(Koeth  et  al.,  2013;  Wang  et  al.,  201 1).  We  assessed  the  correla¬ 
tion  between  serum  polar  metabolites  and  lipids  with  the  gut 
microbiota.  A  Spearman  correlation  between  absolute  abun¬ 
dances  of  metabolites  and  microbial  relative  abundances 
yielded  several  metabolite-microbe  clusters  (Figure  S4).  Most 
significantly,  we  observed  a  clustering  of  triglycerides  with  a 
number  of  microbes,  including  a  positive  correlation  between 
Blautia  and  long-chain  triglycerides  and  Ruminococcus  with 
short-chain  triglycerides,  and  a  negative  correlation  between 
Veilloneila  and  short-chain  triglycerides  (Figure  6B).  At  the  OTU 
level,  we  also  observed  correlations  between  members  of  these 
genera  with  branched-chain  amino  acids,  specifically  a  positive 
correlation  with  Blautia  and  Ruminococcus  members  and  a 
negative  correlation  with  a  Veilionella  member  (Figure  6C). 
Altered  levels  of  serum  friglycerides  are  a  common  feature  of 
obesity  and  type  2  diabetes,  and  hypertriglyceridemia  is 


(B) Shown is the MetaPhlAn clade-specific strain marker profile for a single representative species (Bacteroides ovatus) for three separate individuals. Columns
represent the 37 markers for this species, rows represent samples, and arrows indicate discordant markers between individuals 2 and 3 and indicate markers that
undergo a change in abundance in individual 1.

(C) The Jaccard index (fraction of shared OTUs) between pairs of samples from the same individual within the indicated time window (i.e., 1.5 indicates
0-1.5 months, 6 indicates 4.5-6 months). The Jaccard index is shown for all pairs of samples across all subjects. A power-law curve was fitted to the medians of
the boxplots (line). The box represents the first and third quartiles, and error bars indicate 95% confidence of median. See also Table S3.
Figure 5. The Gut Microbiota Distinguishes Disease Status in T1D prior to Diagnosis

(A) Plot of Chao1 alpha-diversity across time, represented as a smoothing spline with a 95% confidence interval (shaded region). The seroconversion window
indicates the first and third quartiles for age at seroconversion for all seroconverted and T1D-diagnosed individuals, and the diagnosis window indicates the first
quartile of time at T1D diagnosis (third quartile is 1,211 days).

(B) Abundances of the significantly differentially abundant taxa between T1D versus nonconverter and seroconverted individuals, including only samples
between the seroconversion and diagnosis windows. FDR-corrected p values (Q values) were calculated using MaAsLin. The box represents the first and third
quartiles; error bars indicate 95% confidence of median.

(C)  Plots  of  the  relative  abundance  of  representative  species  from  shotgun  metagenomics  data,  represented  as  a  smoothing  spline  with  a  95%  confidence  interval 
(shaded  region).  See  also  Figures  S3  and  S5. 


associated with poor glycemic control and nephropathy in T1D
(Alcantara et al., 2011; Verges, 2009). Additionally, elevated
branched-chain amino acids have been shown in both patients
(Vannini et al., 1982) and mouse models (Mochida et al.,
2011; Sailer et al., 2013) of diabetes, as well as preceding islet
autoimmunity in children who later progress to T1D (Oresic
et al., 2008). We found a positive correlation of Blautia
and Ruminococcus, both of which have increased abundance
in T1D cases, with triglycerides and branched-chain amino
acids, possibly indicating that these microbe-metabolite relationships
cooperatively impact T1D progression.
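
As an illustration of the rank-based correlation used above, the following is a minimal sketch of a Spearman coefficient computed between one metabolite and one taxon. It is not the authors' code: the function names and the five paired values are hypothetical, and significance testing and FDR correction are omitted.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical paired measurements across five samples (illustration only).
triglyceride_abundance = [0.8, 1.1, 1.9, 2.4, 3.0]    # absolute abundance
blautia_relative_abundance = [0.01, 0.02, 0.05, 0.04, 0.09]

rho = spearman(triglyceride_abundance, blautia_relative_abundance)
```

Because only ranks enter the calculation, the coefficient is unaffected by the different scales of absolute metabolite abundances and relative microbial abundances.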

To conduct an integrative analysis of the correlations that
exist between the gut microbiome and the gut (stool) metabolome,
we performed penalized canonical correlation analysis
(Figure 6D; see Experimental Procedures). This analysis identified
a canonical variate that associates increased Ruminococcus
and decreased Veillonella abundance with increased sphingomyelin
and decreased lithocholic acid levels (Pearson R =
0.61; Q = 0.03). Sphingomyelin is a member of the sphingolipids,
which inhibit intestinal natural killer T cell function and protect
against oxazolone-induced colitis (An et al., 2014). Lithocholic
acid, similar to deoxycholic acid, is a secondary bile
acid that promotes intestinal inflammation by eliciting reactive
oxygen and nitrogen species and activating NF-κB in intestinal
epithelial cells (Lee et al., 2004; Muhlbauer et al., 2004;
Payne et al., 2007; Sears and Garrett, 2014; Da Silva et al.,
2012). Although the alterations to the microbiota that we
observed may be related to impaired glucose metabolism in
the prediabetic stage, these results suggest that the T1D-associated
microbiota that becomes established prior to disease
onset may actively promote a metabolic environment in the
gut that is permissive to inflammation and promotes
pathogenesis.

DISCUSSION 

To identify and understand alterations to the gut microbial community
composition that may contribute to childhood disease,
we must first investigate the normal dynamics of the community
in the developing infant. Here, we identify a set of principles that
describe microbiome development in the infant gut. We note as a
caveat that all children in this cohort carry T1D-predisposing
HLA alleles and are restricted to the countries of Finland and
Estonia, and are therefore not necessarily representative of
genetically “normal” infants in other regions of the world. First,
although there is great variation in overall taxonomic composition
between and within individuals over time, there is significantly
less variation in the metabolic composition of the
microbiome, and almost no variation in its metabolic pathway
coding potential. This result provides a variation on the finding
made in the NIH Human Microbiome Project regarding the stability
of metabolic pathways in the microbiome between healthy
adults (Human Microbiome Project Consortium, 2012) and suggests
that the relative proportions of bacterial functional pathways
remain the same from soon after birth until 3 years of
age. We speculate that because the taxonomic composition of
the microbiome stabilizes at approximately 3 years, functional
pathways likely remain stable for long after this age as well.

Second, we identified shared taxonomic trajectories, remarkably
consistent across individuals, that indicate general changes
in abundance, the timing of these shifts, and the relationships
between community members. For example, we saw a strong
positive correlation between the Lachnospiraceae and Ruminococcaceae,
both Gram-positive anaerobes that are inversely
correlated with the Enterobacteriaceae, Gram-negative aerobes.
In turn, the Enterobacteriaceae are positively correlated
with the Bifidobacteriaceae, which decrease in abundance
after cessation of breastfeeding. Although there are many exceptions
to general trends, we observed a decrease in Gram-negative
bacteria over time, and found that early colonizers are
aerobic, whereas later colonizers tend to be anaerobic. Similar
trends have been observed previously (Dominguez-Bello et al.,
2010; Koenig et al., 2011; Palmer et al., 2007); however, the
significantly higher density of sampling, size, and longitudinal
nature of our cohort provide a high-resolution map of these
dynamics and demonstrate how universal they are across
infants.

Third, we demonstrated a surprising stability in the maintenance
of specific strains through time. Although the strain
composition  is  quite  distinct  between  individuals,  as  expected, 
the  strain  composition  within  an  individual  remains  essentially 
constant  throughout  infancy  for  almost  all  individuals  and  almost 
all  species  that  have  sufficiently  high  abundance  for  stability 
analysis. 

In addition to shared trends, we identified aspects of infant gut
microbiome development that are unique to the T1D state. We
observed a significant alteration in the structure of the
T1D-associated gut microbiome: a relative 25% reduction in
alpha-diversity compared to nonconverters and seroconverters,
associated with shifts in both microbial phylogenetic and metabolic
pathways. Importantly, this shift is seen in children who are
diagnosed with T1D within the study time frame, but not in seroconverters
without disease. Although the probability of progression
to T1D after positivity for two islet autoantibodies is greater
than 80% after follow-up for 15 years (Ziegler et al., 2013), there
is significant variability in when progression occurs, ranging from
weeks to more than two decades (Knip et al., 2010), and the factors
contributing to this variability are not well understood. The
logistics of densely sampling a large cohort of individuals
through T1D diagnosis limit the time frame of such a study,
and we therefore are reporting on a special subset of T1D cases
with early onset diabetes (EOD) (Harjutsalo et al., 2013). We provide
evidence that pronounced alterations occur in the gut microbiome
that precede overt disease.

Although previous studies of human cohorts have been constrained
by the availability of sufficient longitudinal samples
and subject groupings that distinguished seroconverting
non-progressors from T1D progressors, they have shown
decreased microbial diversity in children with long-lasting β
cell autoimmunity and in progressors to clinical T1D compared
to nonseroconverted controls (Brown et al., 2011; Giongo
et al., 2011; de Goffau et al., 2013). Here, we demonstrated
that this shift occurs prior to onset of disease but after seroconversion,
and identified that it is specific to T1D progressors and
not seen in seroconverters without disease. Decreased microbial
diversity is a hallmark of dysbiosis and has been observed in
obesity (Turnbaugh et al., 2009), inflammatory bowel disease
(Manichanh et al., 2012), and Clostridium difficile-associated
diarrhea (Chang et al., 2008). A recent study showed that a failure
to establish a critical level of diversity in the gut microbiota of
developing mice resulted in long-term increases in IgE levels,
thus predisposing mice to immune-mediated disorders (Cahenzli
et al., 2013). Decreased diversity results from the blooming of
a small subset of the community that crowds out other community
members.

Additionally, we find higher levels of human β-defensin 2
(hBD2) in early samples of children who develop T1D (Figure S5).
hBD2 is an antimicrobial product induced by colonic
epithelial cells during inflammation (O’Neil et al., 1999; Wehkamp
et al., 2005); therefore, this result is supportive of
increased intestinal inflammation in the cohort of children who
go on to develop T1D. It has been proposed that an aberrant
gut microbiota, a permeable intestinal mucosal barrier, and an
altered mucosal immune response collectively contribute to
the development of T1D (Vaarala et al., 2008). Our results
prompt further functional studies to determine whether the
proinflammatory microbiome we observe to bloom prior to clinical
disease onset may take advantage of or drive increased intestinal
permeability and intestinal inflammation to contribute to
T1D pathogenesis.

EXPERIMENTAL  PROCEDURES 
Study  Cohort 

Please  see  Supplemental  Experimental  Procedures  for  cohort  recruitment  and 
sample  and  information  collection  details. 
Stool Sample Collection and DNA Extraction

Stool samples were collected by participants’ parents and stored in the household
freezer (-20°C) until the next visit to the local study center; samples were
then shipped on dry ice to the DIABIMMUNE Core Laboratory. The samples
were then stored at -80°C until shipping to the Broad Institute for DNA extraction.
DNA extractions from stool were carried out using the QIAamp DNA Stool
Mini Kit (QIAGEN, Inc., Valencia, CA, USA).

Sequencing  and  Analysis  of  the  16S  Gene  and  Shotgun 
Metagenomics 

16S sequencing and metagenomics were performed essentially as previously
described (Gevers et al., 2014). Additional details are available in the Supplemental
Experimental Procedures.

Phylogenetic  Abundance  Trajectory  Network  Analysis 

Analysis was limited to the 29 individuals without T1D. Samples from the 16S
OTU abundance table were binned into 20 time windows from 50 to
1,100 days, selecting the nearest sample in time for each bin. Read counts
from the 16S OTU abundance table were collapsed at each phylogenetic level,
from phylum to genus, and compositionally normalized such that the
abundance in each sample sums to one. At each phylogenetic level, on a
per-individual basis, the correlation between every clade-clade pair was
computed with CCREPE (http://huttenhower.sph.harvard.edu/ccrepe; previously
known as "ReBoot"; Faust et al., 2012). The default similarity metric,
Spearman, was used. Only correlations with a Q value < 0.1 were included
in analysis. For each filtered clade-clade pair, the Z statistic was summed
across all 29 individuals, and only pairs with a cumulative absolute Z statistic
value of >20 were carried forward, as this was a conservative cutoff for consistent
correlations across many subjects. The cumulative Z statistic was scaled
without centering using the R "scale" function and then visualized as a network
diagram in Cytoscape.
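
The pre- and post-processing around the CCREPE step can be sketched as follows. This is an illustrative outline only, not the authors' pipeline: the function names are hypothetical, the 20 target ages are assumed to be evenly spaced (the text gives only the range and the number of windows), and the per-individual Z statistics are assumed to have already been produced by CCREPE and filtered at Q < 0.1.

```python
def normalize(counts):
    """Scale one sample's clade read counts so that they sum to one."""
    total = sum(counts.values())
    return {clade: n / total for clade, n in counts.items()}

def bin_centers(start=50.0, stop=1100.0, n_bins=20):
    """Target ages in days. ASSUMPTION: evenly spaced across the stated range."""
    step = (stop - start) / (n_bins - 1)
    return [start + i * step for i in range(n_bins)]

def nearest_samples(sample_days, centers):
    """For each target age, pick the sample collected closest in time."""
    return [min(sample_days, key=lambda day: abs(day - c)) for c in centers]

def consistent_pairs(z_by_individual, cutoff=20.0):
    """Sum Z statistics per clade pair across individuals; keep |sum| > cutoff.

    z_by_individual: one dict per individual mapping a clade pair to the
    Z statistic of a correlation that already passed the Q < 0.1 filter.
    """
    totals = {}
    for z_scores in z_by_individual:
        for pair, z in z_scores.items():
            totals[pair] = totals.get(pair, 0.0) + z
    return {pair: z for pair, z in totals.items() if abs(z) > cutoff}

# Hypothetical inputs (illustration only).
profile = normalize({"Bacteroides": 30, "Blautia": 10})
centers = bin_centers()
chosen = nearest_samples([45, 120, 400, 1090], centers)

one_individual = {
    ("Lachnospiraceae", "Ruminococcaceae"): 1.0,
    ("Enterobacteriaceae", "Lachnospiraceae"): -0.8,
    ("Bifidobacteriaceae", "Veillonellaceae"): 0.2,
}
kept = consistent_pairs([one_individual] * 29)
```

In this toy input the first two pairs accumulate absolute sums above 20 over 29 individuals and are retained, whereas the weak third pair is dropped. Summing signed Z statistics means that a pair correlated in opposite directions in different individuals tends to cancel out, which is one reason a cumulative cutoff favors relationships that are consistent across subjects.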

Human β-Defensin 2 Measurement from Stool Samples

Frozen stool samples were thawed at room temperature immediately prior to
analysis. Fecal human β-defensin 2 (hBD2) levels were determined by
enzyme-linked immunosorbent assay (ELISA) using the β-defensin 2 ELISA Kit
(Immundiagnostik, Bensheim, Germany) adapted for fecal samples as described
previously (Kapel et al., 1999).

Metabolomics  and  Lipidomics  Profiling  from  Serum  and  Stool 

Please  see  Supplemental  Experimental  Procedures  for  detailed  metabolomics 
and  lipidomics  protocols  and  sample  handling  information. 

Community  Stability  Analysis 

The Jaccard index for a given sample pair is defined as (sample A ∩ sample B)/
(sample A ∪ sample B) and calculated using the compositionally normalized
16S OTU table. On a per-individual basis, the Jaccard index was calculated
for all samples that fell into time windows of 1.5 months in length, beginning
at 0-1.5 months up to 31.5-33 months.
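
A minimal sketch of this calculation for one individual is given below. It assumes that an OTU counts as present in a sample when its normalized abundance is greater than zero and that a sample collected exactly on a window boundary falls into the later window; neither detail is stated in the text, and the function names and sample values are hypothetical.

```python
from itertools import combinations

def jaccard(sample_a, sample_b):
    """Fraction of shared OTUs: |A ∩ B| / |A ∪ B| over OTUs with abundance > 0."""
    present_a = {otu for otu, abundance in sample_a.items() if abundance > 0}
    present_b = {otu for otu, abundance in sample_b.items() if abundance > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # convention chosen here for two empty samples
    return len(present_a & present_b) / len(union)

def window_label(age_months, width=1.5):
    """Upper bound of the window containing the age (1.5 means 0-1.5 months)."""
    return (int(age_months // width) + 1) * width

def jaccard_by_window(samples, width=1.5):
    """Jaccard index for every pair of one individual's samples in the same window.

    samples: list of (age_in_months, {otu: normalized_abundance}) tuples.
    """
    windows = {}
    for age, profile in samples:
        windows.setdefault(window_label(age, width), []).append(profile)
    return {
        label: [jaccard(a, b) for a, b in combinations(profiles, 2)]
        for label, profiles in windows.items()
        if len(profiles) > 1
    }

# Hypothetical samples from a single individual (illustration only).
samples = [
    (0.5, {"otu1": 0.6, "otu2": 0.4, "otu3": 0.0}),
    (1.2, {"otu1": 0.5, "otu3": 0.5}),
    (5.0, {"otu1": 0.2, "otu2": 0.3, "otu4": 0.5}),
    (5.5, {"otu1": 0.7, "otu2": 0.3}),
]
stability = jaccard_by_window(samples)
```

Here the two samples in the 0-1.5 month window share one of three OTUs, and the two samples in the 4.5-6 month window share two of three, so the later window shows the higher index.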

MaAsLin  Analysis 

MaAsLin analysis was performed using default parameters
(http://huttenhower.sph.harvard.edu/maaslin). Subject ID was used as a random
effect. The following variables were used as fixed effects for every analysis,
in addition to the variable of interest: T1D status, age, gender, country, delivery
mode, time and name of antibiotic exposure, total reads per sample,
sequencing batch ID, breastfeeding (on/off), solid food (on/off), eggs (on/off),
fish (on/off), soy products (on/off), rye (on/off), barley (on/off), and buckwheat
and millet (on/off).

Alpha-Diversity 

Alpha-diversity analysis of the 16S OTU table was performed in QIIME 1.5.0
(Caporaso et al., 2010) with the alpha_diversity.py script using the "chao1"
metric and default parameters. Permutation-based analysis of significance
was performed on a per-subject basis by shuffling the T1D subject label
through all individuals and recalculating the difference in Chao1 mean between
T1D subjects versus control and seroconverted subjects. Ten thousand
permutations were performed.
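
The label-shuffling test can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the Chao1 function uses the bias-corrected form S_obs + F1(F1 - 1) / (2(F2 + 1)), and the text does not state which variant QIIME applied; the two-sided comparison and the +1 correction in the p value are choices made here; the function names and per-subject values are hypothetical.

```python
import random
from statistics import mean

def chao1(counts):
    """Bias-corrected Chao1 richness estimate from one sample's OTU counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)   # F1: OTUs seen once
    doubletons = sum(1 for c in counts if c == 2)   # F2: OTUs seen twice
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

def permutation_test(values, is_case, n_perm=10000, seed=0):
    """Shuffle subject labels and compare the difference in group means.

    values: one summary value per subject (e.g., that subject's mean Chao1).
    is_case: parallel list of booleans marking T1D subjects.
    Returns (observed difference, two-sided permutation p value).
    """
    rng = random.Random(seed)

    def mean_difference(labels):
        cases = [v for v, flag in zip(values, labels) if flag]
        others = [v for v, flag in zip(values, labels) if not flag]
        return mean(cases) - mean(others)

    observed = mean_difference(is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if abs(mean_difference(labels)) >= abs(observed):
            hits += 1
    # The +1 terms keep the estimate away from an impossible p value of zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical per-subject mean Chao1 values (illustration only).
case_values = [60, 65, 58, 62]
other_values = [80, 85, 78, 90, 83, 88, 79, 86, 84, 81]
values = case_values + other_values
is_case = [True] * len(case_values) + [False] * len(other_values)

example_richness = chao1([1, 1, 2, 5, 0])
observed_diff, p_value = permutation_test(values, is_case)
```

Shuffling whole-subject labels, rather than individual samples, keeps repeated samples from the same child together, so the test does not treat longitudinal samples as independent observations.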

SUPPLEMENTAL  INFORMATION 

Supplemental Information includes Supplemental Experimental Procedures,
five figures, and three tables and can be found with this article at
http://dx.doi.org/10.1016/j.chom.2015.01.001.
Figure 6. Gut Microbial Gene Content and Serum and Gut Metabolites Are Altered prior to T1D Onset

(A) Abundances of the significantly differentially abundant KEGG modules between T1D versus seroconverted individuals, including only samples between the
seroconversion and diagnosis windows. FDR-corrected p values (Q values) were calculated using MaAsLin. The box represents the first and third quartiles; error
bars indicate 95% confidence of median.

(B and C) (B) Spearman correlations between serum triglycerides and the five most-correlated genera, and (C) between the branched-chain amino acids and
OTUs using a cutoff of p < 0.001. +, correlations with p < 0.05; *, correlations with Q < 0.05. "TG*" represents TG(14:0/18:1/18:1) + TG(16:0/16:1/18:1).

(D)  Spearman  correlations  between  stool  metabolites  and  lipids  and  the  most-correlated  genera.  Coefficients  of  the  canonical  variates  including  Ruminococcus, 
Veillonella,  and  correlated  metabolites  obtained  using  penalized  canonical  correlation  analysis.  See  also  Figure  S4. 